Use the token bucket algorithm from golang.org/x/time/rate for single-instance rate limiting. For distributed rate limiting across multiple instances, use Redis with atomic Lua scripts.
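To make the token bucket mechanics concrete, here is a minimal sketch using only the standard library. It is a hand-rolled illustration, not the golang.org/x/time/rate implementation; the names tokenBucket, allowAt, and simulate are hypothetical. In a real service, rate.NewLimiter and its Allow method from that package would normally take its place.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// tokenBucket is an illustrative token bucket (hypothetical type, not the
// golang.org/x/time/rate implementation).
type tokenBucket struct {
	mu       sync.Mutex
	capacity float64   // maximum burst size
	rate     float64   // tokens added per second
	tokens   float64   // tokens currently available
	last     time.Time // time of the last refill
}

// newTokenBucket returns a bucket that starts full at time now.
func newTokenBucket(capacity, ratePerSec float64, now time.Time) *tokenBucket {
	return &tokenBucket{capacity: capacity, rate: ratePerSec, tokens: capacity, last: now}
}

// allowAt reports whether one request may proceed at time now.
func (b *tokenBucket) allowAt(now time.Time) bool {
	b.mu.Lock()
	defer b.mu.Unlock()
	// Refill for the elapsed time, capped at the bucket capacity.
	if elapsed := now.Sub(b.last).Seconds(); elapsed > 0 {
		b.tokens += elapsed * b.rate
		if b.tokens > b.capacity {
			b.tokens = b.capacity
		}
		b.last = now
	}
	if b.tokens < 1 {
		return false
	}
	b.tokens--
	return true
}

// simulate replays requests at the given offsets (in milliseconds) against a
// fresh bucket and returns the decision for each request.
func simulate(capacity, ratePerSec float64, offsetsMs []int) []bool {
	start := time.Unix(0, 0)
	b := newTokenBucket(capacity, ratePerSec, start)
	out := make([]bool, len(offsetsMs))
	for i, ms := range offsetsMs {
		out[i] = b.allowAt(start.Add(time.Duration(ms) * time.Millisecond))
	}
	return out
}

func main() {
	// Capacity 2, refill 1 token/s: a burst of two passes, a third immediate
	// request is rejected, and one more passes after a second of refill.
	fmt.Println(simulate(2, 1, []int{0, 0, 0, 1000})) // [true true false true]
}
```

Passing the clock in explicitly keeps the sketch deterministic; a production limiter would read the current time itself.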
Token bucket (golang.org/x/time/rate): allows bursts up to bucket capacity, refills at a constant rate
Sliding window: more accurate than fixed window; avoids the burst of up to twice the limit that a fixed window allows across a window boundary
Per-user rate limits are fairer than per-IP limits for authenticated APIs; use the user ID from the JWT as the key
redis-cell (a Redis module that adds the CL.THROTTLE command) or atomic Lua scripts for distributed rate limiting across service instances
Return a Retry-After header with 429 responses so well-behaved clients know how long to back off
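The per-user keying and the 429 response with Retry-After can be sketched as a net/http middleware. This is a simplified illustration under stated assumptions: perUserLimiter, reserve, rateLimit, and demo are hypothetical names, the limiter enforces only a minimum interval per user (a bucket of capacity one), and the user ID is read from a hypothetical X-User-ID header that stands in for a claim taken from an already-verified JWT.

```go
package main

import (
	"fmt"
	"math"
	"net/http"
	"net/http/httptest"
	"strconv"
	"sync"
	"time"
)

// perUserLimiter enforces a minimum interval between requests per user ID.
// Hypothetical type for illustration; entries are never evicted here.
type perUserLimiter struct {
	mu       sync.Mutex
	interval time.Duration        // minimum spacing between requests
	next     map[string]time.Time // earliest time each user may send again
	now      func() time.Time     // injectable clock
}

// reserve returns 0 if the user may proceed now, otherwise how long to wait.
func (l *perUserLimiter) reserve(userID string) time.Duration {
	l.mu.Lock()
	defer l.mu.Unlock()
	now := l.now()
	if wait := l.next[userID].Sub(now); wait > 0 {
		return wait
	}
	l.next[userID] = now.Add(l.interval)
	return 0
}

// rateLimit rejects over-limit requests with 429 and a Retry-After header
// giving the wait in whole seconds, rounded up.
func rateLimit(l *perUserLimiter, userID func(*http.Request) string, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		if wait := l.reserve(userID(r)); wait > 0 {
			secs := int(math.Ceil(wait.Seconds()))
			w.Header().Set("Retry-After", strconv.Itoa(secs))
			http.Error(w, "rate limit exceeded", http.StatusTooManyRequests)
			return
		}
		next.ServeHTTP(w, r)
	})
}

// demo sends two immediate requests as one user and one as another, and
// returns "status/Retry-After" for each response.
func demo() []string {
	clock := time.Unix(0, 0) // frozen clock keeps the demo deterministic
	l := &perUserLimiter{
		interval: 2 * time.Second,
		next:     map[string]time.Time{},
		now:      func() time.Time { return clock },
	}
	// Assumption: an upstream auth layer has verified the JWT and copied the
	// user ID into the hypothetical X-User-ID header.
	key := func(r *http.Request) string { return r.Header.Get("X-User-ID") }
	ok := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) { fmt.Fprint(w, "ok") })
	h := rateLimit(l, key, ok)

	var out []string
	for _, user := range []string{"alice", "alice", "bob"} {
		req := httptest.NewRequest(http.MethodGet, "/", nil)
		req.Header.Set("X-User-ID", user)
		rec := httptest.NewRecorder()
		h.ServeHTTP(rec, req)
		out = append(out, strconv.Itoa(rec.Code)+"/"+rec.Header().Get("Retry-After"))
	}
	return out
}

func main() {
	// The second request from alice is throttled; bob has his own allowance.
	fmt.Println(demo()) // [200/ 429/2 200/]
}
```

Because the state lives in process memory, this only limits a single instance; with several instances the same check would need to run against shared state such as Redis.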